Abstract. We consider the problem of approximating the branch and size dependent demand of a fashion
discounter with many branches by a distributing process being based on the branch delivery restricted to integral
multiples of lots from a small set of available lot-types. We propose a formalized model which arises from a
practical cooperation with an industry partner. Besides an integer linear programming formulation and a primal
heuristic for this problem we also consider a more abstract version which we relate to several other classical
optimization problems like the p-median problem, the facility location problem or the matching problem.

1. Introduction



Usually, fashion discounters can only achieve small profit margins. Their economic success depends
mostly on the ability to meet the customers' demands for individual products. More specifically: offer
exactly what you can sell to your customers. This task has two aspects: offer what the customers would
like to wear (attractive products) and offer the right volumes in the right places and the right sizes (demand
consistent branch and size distribution).

In this paper we deal with the second aspect only: meet the branch and size specific demand for products
as closely as possible. Our industry partner is a fashion discounter with more than 1 000 branches most of
whose products are never replenished, except for the very few "never-out-of-stock" products (NOS products):
because of lead times of around three months, apparel replenishments would be too late anyway. In
most cases the supplied items per product and apparel size lie in the range between 1 and 6. Clearly, it is
difficult to determine a good estimate for the branch and size dependent demand, but apart from a
few practical comments we will leave this aspect of the problem aside.
The problem we deal with in this article comes from another direction. Our business partner is a
discounter who is under a lot of pressure to reduce his costs. So he is forced to have lean distribution logistics that
work efficiently. For this reason he, on the one hand, never replenishes and, on the other hand, tries to
reduce the distribution complexity. To achieve this goal the supply of the branches is based on the delivery
of lots, i.e., pre-packed assortments of single products in various sizes. Every branch can only be supplied
with an integral multiple of one lot-type from a rather small number of available lot-types. So he has to face
an approximation problem: which (integral) multiples of which (integral) lot-types should be supplied to a
branch in order to meet a (fractional) mean demand as closely as possible?

We call this specific demand approximation problem the lot-type design problem (LDP). 

1.1. Related Work. The model we suggest for the LDP is closely related to the extensively studied
p-median problem and the facility location problem. These problems appear in various applications as some kind
of clustering problems. Many heuristics have been applied to them. Nevertheless, the first constant-factor
approximation algorithm, based on LP rounding, was not given until 1999 by Charikar, Guha, Tardos,
and Shmoys IS). We give a more detailed treatment of the literature on approximation algorithms and
heuristics for the p-median problem and the facility location problem in Subsection 4.1.



1.2. Our contribution. In cooperation with our business partner, we identified the lot-type design problem
as a pressing real-world task. We present an integer linear program (ILP) formulation of the LDP that
looks abstractly like a p-median problem with an additional cardinality constraint. We call this problem the
cardinality constrained p-median problem (Card-p-MP). To the best of our knowledge, the Card-p-MP has
not been studied in the literature so far.

Although the ILP model can be solved by standard software on a state-of-the-art PC in reasonable time,
the computation times are prohibitive for use in the field, where interactive decision support on a laptop
is a must for negotiations with the supplier. Therefore, we present a very fast primal any-time heuristic
that yields good solutions almost instantly and searches for improvements as long as it is kept running. We
demonstrate on real data that the optimality gaps of our heuristic are mostly well below 1 %. At the moment
the heuristic is in test mode.

1.3. Outline of the paper. In Section 2 we will briefly describe the real world problem, which we will
formalize and model in Section 3. In Section 4 we will present its abstract version, the cardinality constrained
p-median problem (Card-p-MP). Besides a formalized description we relate it to several other well known
optimization problems like the matching problem, the facility location problem, or the p-median problem. In
Section 5 we present a primal heuristic for the Card-p-MP, which we apply to our real world problem. We
give some numerical data on the optimality gap of our heuristic before we draw a conclusion in Section 6.

2. The real world problem 

Our industry partner is a fashion discounter with over 1 000 branches. Products cannot be replenished,
and the number of sold items per product and branch is rather small. There are no historic sales data for 
a specific product available, since every product is sold only for one sales period. The challenge for our 
industry partner is to determine a suitable total amount of items of a specific product which should be 
bought from the supplier. For this part the knowledge and experience of the buyers employed by a fashion 
discounter is used. We seriously doubt that a software package based on historic sales data can do better. 

But there is another task that is more accessible to computer-aided forecasting methods. Once the total
amount of sellable items of a specific product is determined, one has to decide how to distribute this total
amount to a set of branches $\mathcal{B}$ in certain apparel sizes with, in general, different demands. There are some
standard techniques to estimate branch- and size-dependent demand from historic sales data of related
products, e.g., products in the same commodity group. We will address the problem of demand forecasting
very briefly in Subsection 3.1. But let us assume for simplicity that we know the exact (fractional) branch
and size dependent mean demands for a given new product or have at least good estimates.

Due to cost reasons, our industry partner organizes his distribution process for the branches using a central
warehouse. To reduce the number of necessary handling steps in the distribution process he utilizes the concept
of lots, by which we understand a collection of some items of one product. One could have in mind different
sizes or different colors at this point. To reduce the complexity of the distribution process, the number
of used lot-types, i.e., different collections of items, is also limited to a rather small number.

One could imagine that the branch- and size-dependent demand for a specific product may vary broadly
over the large set of branches. This is at least the case for the branches of our industry partner. The only
flexibility to satisfy the demand in each single branch is to choose a suitable lot-type from the small set
of available lot-types and to choose a suitable multiplier, i.e., how many lots of a chosen lot-type a specific
branch should get. One should keep in mind that we are talking about small multipliers here, i.e., small
branches will receive only one lot, medium sized branches will receive two lots, and very big branches will
receive three lots of a lot-type with, say, six items.

The cost reductions achieved by this lot-based distribution system are paid for with a reduced ability to
approximate the branch- and size-dependent demand. So one question is how many different lot-types one
should allow in order to approximate the branch- and size-dependent demand of the branches up
to an acceptable deviation on the one hand and to avoid a complex and cost-intensive distribution process in
the central warehouse on the other hand. But also for a fixed number of allowed lot-types the question of the
best possible approximation of the demand by a lot-based supply of the branches arises. In other words,
we are searching for an optimal assignment of branches to lot-types together with corresponding multipliers
so that the deviation between the theoretical estimated demand and the planned supply with lots is minimal.
This is the main question we will focus on in this paper.

3. Mathematical modeling of the problem 

In this section we abstract from the real world problem of the previous section and develop a
formulation as a well defined optimization problem. Crucial and very basic objects for our considerations
are the set of branches $\mathcal{B}$, the set of sizes $\mathcal{S}$ (in a more general context one could also think of a set of variants
of a product, like, e.g., different colors), and the set of products $\mathcal{P}$.

In practice, we may want to sell a given product $p \in \mathcal{P}$ only in some branches $\mathcal{B}_p \subseteq \mathcal{B}$ and only in some
sizes $\mathcal{S}_p \subseteq \mathcal{S}$ (clearly there are different types of sizes for, e.g., skirts or socks). To model the demand of a
given branch $b \in \mathcal{B}_p$ for a given product $p \in \mathcal{P}$ we use the symbol $\eta_{b,p}$, by which we understand a mapping
$\varphi_{b,p}$ from the set of sizes $\mathcal{S}_p$ into a suitable mathematical object. This object may be a random variable
or simply a real number representing the mean demand. In this paper we choose the latter possibility. For
the sake of a brief notation we regard $\eta_{b,p}$ as a vector $(\varphi_{b,p}(s_{i_1}) \;\; \varphi_{b,p}(s_{i_2}) \;\; \dots \;\; \varphi_{b,p}(s_{i_r})) \in \mathbb{R}^r$,
where we assume that $\mathcal{S} = \{s_1, \dots, s_t\}$ and $\mathcal{S}_p = \{s_{i_1}, \dots, s_{i_r}\}$ with $i_j < i_{j+1}$ for all $j \in \{1, \dots, r-1\}$.

3.1. Estimation of the branch- and size-dependent demand. For the purpose of this paper, we may
assume that the demands $\eta_{b,p}$ are given, but, since this is a very critical part in practice, we would like to
mention some methods to obtain these numbers. Marketing research might be a possible source.
Another possibility to estimate the demand for a product is to utilize historic sales information. We may assume
that for each product $p$ which was formerly sold by our retailer, each branch $b \in \mathcal{B}$, each size $s \in \mathcal{S}$ and
each day of sales $d$ we know the number $\tau_{b,p}(d,s)$ of items of product $p$ which were sold in branch $b$ in
size $s$ during the first $d$ days of sales. Additionally we assume that we have a set $\mathcal{U} \subseteq \mathcal{P}$ of formerly sold
products which are in some sense similar to the new product $p$ (one might think of the set of jeans if our
new product is also a pair of jeans). By $\mathcal{U}_{b,s}$ we denote the subset of products in $\mathcal{U}$ which were traded by a positive
amount in size $s$ in branch $b$, and by $\chi_{b,s}(p)$ we denote a characteristic function which equals 1 if product $p$
is distributed in size $s$ to branch $b$, and equals 0 otherwise. For a given day of sales $d$ the value

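$$\tilde{\eta}_{b,p,d}(s) := \frac{c}{|\mathcal{U}_{b,s}|} \sum_{u \in \mathcal{U}_{b,s}} \frac{\tau_{b,u}(d,s) \cdot \sum_{b' \in \mathcal{B}_p} \sum_{s' \in \mathcal{S}_p} \chi_{b',s'}(u)}{\sum_{b' \in \mathcal{B}_p} \sum_{s' \in \mathcal{S}_p} \tau_{b',u}(d,s')} \qquad (1)$$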

might be a usable estimate for the demand $\eta_{b,p}(s)$, after choosing a suitable scaling factor $c \in \mathbb{R}$ so that
the total estimated demand

$$\sum_{b \in \mathcal{B}_p} \sum_{s \in \mathcal{S}_p} \tilde{\eta}_{b,p,d}(s)$$

over all branches and sizes equals the total requirements. We would like to remark that for a small number of
days of sale $d$ the quality of the estimate $\tilde{\eta}_{b,p,d}(s)$ suffers from the fact that the stochastic noise of the consumer behavior
is too dominant, and for large $d$ the quality of the estimate suffers from stockout-substitution.
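
The scaling step can be illustrated with a short Python sketch. It assumes that unscaled estimates are already available as a dictionary keyed by (branch, size); the function name `scale_to_total` and the sample numbers are hypothetical and only show how the factor $c$ is obtained from the total requirement.

```python
def scale_to_total(raw_estimates, total_items):
    """Rescale unscaled demand estimates so that they sum to total_items."""
    raw_sum = sum(raw_estimates.values())
    if raw_sum <= 0:
        raise ValueError("raw estimates must have a positive sum")
    c = total_items / raw_sum  # the scaling factor c from the text
    return {key: c * value for key, value in raw_estimates.items()}


# Hypothetical unscaled estimates for two branches and two sizes.
raw = {("b1", "S"): 0.5, ("b1", "M"): 1.5, ("b2", "S"): 1.0, ("b2", "M"): 1.0}
scaled = scale_to_total(raw, total_items=10)
```

The scaled values remain fractional mean demands; only their sum is fixed to the total requirement.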

There are parametric approaches to this problem in the literature (like Poisson-type sales processes). In 
the data that was available to us, we could not verify the main assumptions of such models, though (not even 
close). 

In our real world data set we have observed that the sales period of a product (say, the time by
which 80 % of the supply is sold) varies a lot depending on the product. This effect is due to the attractiveness
of a given product (one might think of two T-shirts which only differ in their color, where one color is
in vogue and the other color is not). To compensate for this effect we have chosen the day of sales $d$ in
dependence of the product $u \in \mathcal{U}_{b,s}$. More precisely, we have chosen $d_u$ so that in the first $d_u$ days of sales
a certain percentage of all items of product $u$ were sold over all branches and sizes.

Another possibility to estimate the demand is to perform the estimation for the branch-dependent demand 
aggregated over all sizes and the size-dependent demand for a given branch separately. 

More sophisticated methods of demand estimation from historic sales based on small data sets are, e.g.,
described in [19, 20]. Research results from forecasting NOS (never-out-of-stock) items, see, e.g.,
[1, 17, 24] for some surveys, may also be utilized. In addition, quite a lot of software packages for demand
forecasting are available, see [31] for an overview.

3.2. Supply of the branches by lots. To reduce handling costs in logistics and stockkeeping our business
partner orders his products from his external suppliers in so-called lots. These are assortments of several items
of one product in different sizes which form an entity. One could have in mind a set of T-shirts in different
sizes which are wrapped in plastic foil. The usage of lots has the great advantage of reducing the
number of picks during the distribution process in a high-wage country like Germany, where our partner
operates.

Let us assume that the set of sizes for a given product $p$ is given by $\mathcal{S}_p = \{s_{i_1}, \dots, s_{i_r}\}$ with $i_j < i_{j+1}$
for all $j \in \{1, \dots, r-1\}$. By a lot-type $l$ we understand a mapping $\varphi : \mathcal{S}_p \to \mathbb{N}$, which can also be denoted
by a vector $(\varphi(s_{i_1}) \;\; \varphi(s_{i_2}) \;\; \dots \;\; \varphi(s_{i_r}))$ of non-negative integers.

By $\mathcal{L}$ we denote the set of applicable lot-types. One could imagine that a lot of a certain lot-type should
not contain too many items in order to be manageable. In the other direction it should also not contain too
few items in order to make use of the cost reduction potential of the lot idea. Since the set of applicable
lot-types may depend on the characteristics of a certain product $p$ we specialize this definition to a set
$\mathcal{L}_p$ of manageable lot-types. (One might imagine that a warehouseman can handle more T-shirts than, e.g.,
winter coats; another effect that can be modeled by a suitable set of lot-types is to enforce that each size
in $\mathcal{S}_p$ is supplied to each branch in $\mathcal{B}_p$ by a positive amount due to legal requirements for advertised
products.)

To reduce the complexity and the error-proneness of the distribution process in a central warehouse, each
branch $b \in \mathcal{B}_p$ is supplied only with lots of one lot-type $l_{b,p} \in \mathcal{L}_p$. We model the assignment of lot-types
$l \in \mathcal{L}_p$ to branches $b \in \mathcal{B}_p$ as a function $\omega_p : \mathcal{B}_p \to \mathcal{L}_p$, $b \mapsto l_{b,p}$. Clearly, this assignment $\omega_p$ is a
decision variable which can be used to optimize some target function. The only flexibility that we have to
approximate the branch-, size- and product-dependent demand $\eta_{b,p}$ by our delivery in lots is to supply an
integral number $m_{b,p}$ of lots of lot-type $\omega_p(b)$ to branch $b$. Again, we can denote this connection by
a function $m_p : \mathcal{B}_p \to \mathbb{N}$, $b \mapsto m_{b,p}$. Due to practical reasons, the total number $|\omega_p(\mathcal{B}_p)|$ of used
lot-types for a given product is also limited by a certain number $\kappa$.

3.3. Deviation between supply and demand. With the notation from the previous subsection, we can
represent the planned supply for branch $b$ with product $p$ as a vector $m_p(b) \cdot \omega_p(b) \in \mathbb{N}^r$. To measure the
deviation between the supply $m_p(b) \cdot \omega_p(b)$ and the demand $\eta_{b,p}$ we may utilize an arbitrary vector norm
$\|\cdot\|$. Norms worth mentioning in our context are the sum of absolute values

$$\|(v_1 \;\; v_2 \;\; \dots \;\; v_r)\|_1 := \sum_{i=1}^{r} |v_i|,$$

the maximum norm

$$\|(v_1 \;\; v_2 \;\; \dots \;\; v_r)\|_\infty := \max\{|v_i| : 1 \le i \le r\},$$

and the general p-norm

$$\|(v_1 \;\; v_2 \;\; \dots \;\; v_r)\|_p := \left( \sum_{i=1}^{r} |v_i|^p \right)^{1/p}$$

for real numbers $p > 0$, which is also called the Euclidean norm for $p = 2$. With this we can define the
deviation

$$\sigma_{b,l,m} := \|\eta_{b,p} - m \cdot l\|_\star$$

between demand $\eta_{b,p}$ and supply $m \in \{1, \dots, M\} =: \mathcal{M} \subset \mathbb{N}$ times lot-type $l \in \mathcal{L}_p$ for each branch
$b \in \mathcal{B}_p$ and an arbitrary norm $\|\cdot\|_\star$, for a given product $p \in \mathcal{P}$. It depends on practical considerations which
norm to choose. The $\|\cdot\|_1$-norm is very insensitive to outliers, in contrast to the $\|\cdot\|_\infty$-norm, which
is extremely sensitive to outliers. A possible compromise may be the Euclidean norm $\|\cdot\|_2$, but
for most considerations we choose the $\|\cdot\|_1$-norm because of its robustness. (We do not trust every single
exact value in our demand forecasts that much.)
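
As a small illustration of the deviation $\sigma_{b,l,m}$, the following Python sketch evaluates it for one branch under the three norms mentioned above. The function name `deviation`, the norm labels and the sample demand vector are assumptions made for this example only.

```python
def deviation(demand, lot, m, norm="l1"):
    """Deviation between a fractional demand vector and m lots of a lot-type."""
    diffs = [abs(d - m * q) for d, q in zip(demand, lot)]
    if norm == "l1":      # sum of absolute values
        return sum(diffs)
    if norm == "linf":    # maximum norm
        return max(diffs)
    if norm == "l2":      # Euclidean norm
        return sum(x * x for x in diffs) ** 0.5
    raise ValueError(f"unknown norm: {norm}")


# Hypothetical mean demand of one branch in four sizes and one lot-type.
demand = (0.8, 2.1, 1.9, 0.6)
lot = (1, 2, 2, 1)
dev_l1 = deviation(demand, lot, 1, "l1")      # roughly 0.8
dev_linf = deviation(demand, lot, 1, "linf")  # roughly 0.4
```

With multiplier 2 the same lot-type overshoots the demand in every size, so its $\|\cdot\|_1$ deviation is much larger than with multiplier 1.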

For given functions $m_p$ and $\omega_p$ we can consider the deviation vector

$$\Sigma_p := \left( \sigma_{b_1,\omega_p(b_1),m_p(b_1)} \;\; \sigma_{b_2,\omega_p(b_2),m_p(b_2)} \;\; \dots \;\; \sigma_{b_q,\omega_p(b_q),m_p(b_q)} \right)$$

if the set of branches is written as $\mathcal{B}_p := \{b_1, \dots, b_q\}$. To measure the total deviation of supply and demand
we can apply an arbitrary norm $\|\cdot\|_\star$, which may be different from the norm used to measure the deviation of a
branch, to $\Sigma_p$. In this paper we restrict ourselves to the $\|\cdot\|_1$-norm, so that we have

$$\|\Sigma_p\|_1 = \sum_{b \in \mathcal{B}_p} \sigma_{b,\omega_p(b),m_p(b)}.$$

3.4. The cardinality condition. For a given assignment $\omega_p$ of lot-types to branches and corresponding
multiplicities $m_p$ the quantity

$$I := \sum_{b \in \mathcal{B}_p} m_p(b) \cdot \|\omega_p(b)\|_1 \in \mathbb{N}$$

gives the total number of planned distributed items of product $p$ over all sizes and branches. From a practical
point of view we introduce the condition

$$\underline{I} \le I \le \overline{I}, \qquad (2)$$

where $\underline{I}$, $\overline{I}$ are suitable integers. One might imagine that our retailer may buy a part of already produced
products so that there is a natural upper bound $\overline{I}$, or that there are some minimum quantities. Another
interpretation may be that the buying department of our retailer has a certain idea of the value of $I$ but is
only able to give an interval $[\underline{I}, \overline{I}]$.

During our cooperation with our business partner we have learned that in practice you do not get what you
order. If you order exactly $I$ items of a given product you will obtain $I$ plus or minus a certain percentage of
items in the end. (And there actually exists a certain percentage up to which a retailer accepts a deviation
between the original order and the final delivery by its external suppliers as a fulfilled contract.)



6 CONSTANTIN GAUL, SASCHA KURZ, AND JORG RAMBAU 

Besides these and other practical reasons to consider an interval $[\underline{I}, \overline{I}]$ for the total number of items of a
given product, there are very strong reasons not to replace Inequalities (2) by an equation, as we will explain
in the following. Let us consider the case where our warehouse (or our external suppliers in a low-cost
country) is only able to deal with a single lot-type per product. This is the case $\kappa = 1$. Let us further assume
that there exists a rather small integer $k$ (e.g. $k = 20$) fulfilling $\|l\|_1 \le k$ for all $l \in \mathcal{L}_p$. If $I$ contains a
prime divisor larger than $k$, then there may exist no multiplicities $m_p(b) \in \mathbb{N}$ ($\omega_p$ is a constant
function due to $\kappa = 1$) which lead to a feasible solution of our problem. These number-theoretic influences
are somewhat ugly. In some cases they lead to the infeasibility of our problem or to bad solutions with respect
to the quality of the demand-supply approximation in comparison to a relaxed version of the problem, where
the restrictions on $I$ are weaker. One could have in mind the possibility of throwing one item into the garbage
if this will have a large impact on the quality of the demand-supply approximation.
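
The number-theoretic effect can be made concrete with a small Python sketch for the case $\kappa = 1$. It enumerates all totals that a single lot-type can produce; the lot sizes between 6 and 20 items, the 100 branches and the maximal multiplier 3 are hypothetical values chosen for this illustration.

```python
def attainable_totals(lot_sizes, n_branches, max_mult):
    """Totals reachable when every branch gets 1..max_mult lots of ONE lot-type.

    A lot-type with n items yields the total n * t, where t is the sum of the
    branch multipliers, so t ranges from n_branches to n_branches * max_mult.
    """
    totals = set()
    for n in lot_sizes:
        for t in range(n_branches, n_branches * max_mult + 1):
            totals.add(n * t)
    return totals


# Assumed setting: lots contain between 6 and 20 items.
totals = attainable_totals(range(6, 21), n_branches=100, max_mult=3)
exact_ok = 1009 in totals                                # 1009 is prime
interval_ok = any(1000 <= t <= 1020 for t in totals)     # e.g. 10 * 100
```

Under these assumptions an exact order of 1009 items is infeasible, while the interval from 1000 to 1020 items admits solutions.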

In Equation (1) for the demand estimation we have used a certain number $I$ for the total number of items
to scale the demands $\eta_{b,p}$ by a factor $c$. From a more general point of view it may also happen that the total
demand

$$\sum_{b \in \mathcal{B}_p} \sum_{s \in \mathcal{S}_p} \eta_{b,p}(s)$$

is not contained in the interval $[\underline{I}, \overline{I}]$. In this case the $\|\cdot\|_1$-norm may not be very appropriate. In our
estimation process, however, the demand forecasts in fact yield demand percentages rather than absolute
numbers. The total volume is then used to calculate the absolute (fractional) mean demand values, so that in
our work-flow the total demand is always in the target interval.



3.5. The optimization problem. Summarizing the ideas and using the notations from the previous
subsections we can formulate our optimization problem in the following form. We want to determine an assignment
function $\omega_p : \mathcal{B}_p \to \mathcal{L}_p$ and multiplicities $m_p : \mathcal{B}_p \to \mathcal{M} = \{1, \dots, M\} \subset \mathbb{N}$ such that the total deviation
between supply and demand

$$\sum_{b \in \mathcal{B}_p} \sigma_{b,\omega_p(b),m_p(b)} \qquad (3)$$

is minimized with respect to the conditions

$$|\omega_p(\mathcal{B}_p)| \le \kappa \qquad (4)$$

and

$$\underline{I} \le \sum_{b \in \mathcal{B}_p} m_p(b) \cdot \|\omega_p(b)\|_1 \le \overline{I}. \qquad (5)$$
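
To make the problem statement (3) to (5) tangible, here is a brute-force reference solver in Python for tiny instances. It is only a sketch for checking small examples by complete enumeration and is not the method proposed in this paper; the function names, the three branches and the three lot-types are hypothetical, and the branch deviation is measured in the $\|\cdot\|_1$-norm.

```python
from itertools import combinations, product


def l1_deviation(demand, lot, m):
    """Branch deviation in the l1-norm for m lots of the given lot-type."""
    return sum(abs(d - m * q) for d, q in zip(demand, lot))


def solve_tiny_ldp(demands, lot_types, max_mult, kappa, lo, hi):
    """Enumerate all plans with at most kappa lot-types and lo <= total <= hi."""
    branches = sorted(demands)
    best_cost, best_plan = float("inf"), None
    # Choosing kappa lot-types also covers plans that use fewer of them.
    for chosen in combinations(lot_types, min(kappa, len(lot_types))):
        options = [(lot, m) for lot in chosen for m in range(1, max_mult + 1)]
        for plan in product(options, repeat=len(branches)):
            total = sum(m * sum(lot) for lot, m in plan)  # cardinality (5)
            if not lo <= total <= hi:
                continue
            cost = sum(l1_deviation(demands[b], lot, m)
                       for b, (lot, m) in zip(branches, plan))
            if cost < best_cost:
                best_cost, best_plan = cost, dict(zip(branches, plan))
    return best_cost, best_plan


demands = {"b1": (1.0, 2.0, 1.0), "b2": (2.2, 3.8, 2.1), "b3": (0.9, 1.1, 1.0)}
lot_types = [(1, 2, 1), (1, 1, 1), (2, 2, 1)]
cost1, plan1 = solve_tiny_ldp(demands, lot_types, 2, kappa=1, lo=15, hi=17)
cost2, plan2 = solve_tiny_ldp(demands, lot_types, 2, kappa=2, lo=15, hi=17)
```

In this toy instance allowing a second lot-type reduces the total deviation from about 1.5 to about 0.7. The enumeration grows exponentially in the number of branches, so it is useful only as a correctness check.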

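We use binary variables $x_{b,l,m}$, which are equal to 1 if and only if lot-type $l \in \mathcal{L}_p$ is delivered with
multiplicity $m \in \mathcal{M}$ to branch $b$, and binary variables $y_l$, which are 1 if and only if at least one branch
in $\mathcal{B}_p$ is supplied with lot-type $l \in \mathcal{L}_p$. With this, we can easily model our problem as an integer linear
program:

$$\begin{aligned}
\min\ & \sum_{b \in \mathcal{B}_p} \sum_{l \in \mathcal{L}_p} \sum_{m \in \mathcal{M}} \sigma_{b,l,m} \cdot x_{b,l,m} && && (6) \\
\text{s.t.}\ & \sum_{l \in \mathcal{L}_p} \sum_{m \in \mathcal{M}} x_{b,l,m} = 1 && \forall b \in \mathcal{B}_p && (7) \\
& \sum_{b \in \mathcal{B}_p} \sum_{l \in \mathcal{L}_p} \sum_{m \in \mathcal{M}} m \cdot \|l\|_1 \cdot x_{b,l,m} \le \overline{I} && && (8) \\
& \sum_{b \in \mathcal{B}_p} \sum_{l \in \mathcal{L}_p} \sum_{m \in \mathcal{M}} m \cdot \|l\|_1 \cdot x_{b,l,m} \ge \underline{I} && && (9) \\
& \sum_{m \in \mathcal{M}} x_{b,l,m} \le y_l && \forall b \in \mathcal{B}_p\ \forall l \in \mathcal{L}_p && (10) \\
& \sum_{l \in \mathcal{L}_p} y_l \le \kappa && && (11) \\
& x_{b,l,m} \in \{0,1\} && \forall b \in \mathcal{B}_p\ \forall l \in \mathcal{L}_p\ \forall m \in \mathcal{M} && (12) \\
& y_l \in \{0,1\} && \forall l \in \mathcal{L}_p && (13)
\end{aligned}$$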

The objective function (6) represents the sum (3), since irrelevant tuples $(b, l, m)$ are suppressed
by $x_{b,l,m} = 0$. Condition (7) states that we assign to each branch $b$ exactly one lot-type with a unique
multiplicity. The cardinality condition (5) is modeled by Conditions (8) and (9), and the restriction (4) on
the number of used lot-types is modeled by Condition (11). The connection between the $x_{b,l,m}$ and the
$y_l$ is fixed in the usual Big-M condition (10). We would like to remark that the LP relaxation of this ILP
formulation is very strong, above all in comparison to the more direct ILP formulation, where we assume that the
branch deviation between supply and demand is measured by the $\|\cdot\|_1$-norm:

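$$\begin{aligned}
\min\ & \sum_{b \in \mathcal{B}_p} \sum_{s \in \mathcal{S}_p} z_{b,s} \\
\text{s.t.}\ & \eta_{b,p}(s) - \alpha_{b,s} \le z_{b,s} && \forall b \in \mathcal{B}_p\ \forall s \in \mathcal{S}_p \\
& \alpha_{b,s} - \eta_{b,p}(s) \le z_{b,s} && \forall b \in \mathcal{B}_p\ \forall s \in \mathcal{S}_p \\
& \sum_{l \in \mathcal{L}_p} \sum_{m \in \mathcal{M}} x_{b,l,m} = 1 && \forall b \in \mathcal{B}_p \\
& \sum_{b \in \mathcal{B}_p} \sum_{l \in \mathcal{L}_p} \sum_{m \in \mathcal{M}} m \cdot \|l\|_1 \cdot x_{b,l,m} \le \overline{I} \\
& \sum_{b \in \mathcal{B}_p} \sum_{l \in \mathcal{L}_p} \sum_{m \in \mathcal{M}} m \cdot \|l\|_1 \cdot x_{b,l,m} \ge \underline{I} \\
& \sum_{m \in \mathcal{M}} x_{b,l,m} \le y_l && \forall b \in \mathcal{B}_p\ \forall l \in \mathcal{L}_p \\
& \sum_{l \in \mathcal{L}_p} y_l \le \kappa \\
& \sum_{l \in \mathcal{L}_p} \sum_{m \in \mathcal{M}} m \cdot l[s] \cdot x_{b,l,m} = \alpha_{b,s} && \forall b \in \mathcal{B}_p\ \forall s \in \mathcal{S}_p \\
& x_{b,l,m} \in \{0,1\} && \forall b \in \mathcal{B}_p\ \forall l \in \mathcal{L}_p\ \forall m \in \mathcal{M} \\
& y_l \in \{0,1\} && \forall l \in \mathcal{L}_p \\
& \alpha_{b,s} \in \mathbb{R}_+ && \forall b \in \mathcal{B}_p\ \forall s \in \mathcal{S}_p,
\end{aligned}$$

where $l[s]$ is the entry in vector $l$ corresponding to size $s$.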

We would like to remark that our strong ILP formulation of the problem of Subsection 3.5 can be used
to solve all real world instances of our business partner in at most 30 minutes by using a standard ILP
solver like CPLEX 11. Unfortunately, this is not fast enough for our real world application. The buyers of
our retailer need a software tool which can produce a near optimal order recommendation in real time on
a standard laptop. The buying staff travels to one of the external suppliers to negotiate several orders.
When they get to the details, the buyer enters some key data like $\underline{I}$, $\overline{I}$, $\mathcal{B}_p$, $\mathcal{S}_p$, and $\mathcal{L}_p$ into his laptop and
immediately wants a recommendation for an order in terms of multiples of lot-types. For this reason, we
consider in Section 5 a fast heuristic, which has only a small gap compared to the optimal solution on a test
set of real world data of our business partner.

4. The Cardinality Constrained p-Median Problem 

In the previous section we have modeled our real world problem from Section 2. Now we want to abstract
from this practical problem and formulate a more general optimization problem which we will relate to 
several well known optimization problems. 

For the general Cardinality Constrained p-Median Problem let $p$ be an integer, $\mathcal{S}$ a set of chooseable
items, $\mathcal{D}$ a set of demanders, $\delta : \mathcal{D} \to \mathbb{R}_+$ a demand function, and $[\underline{I}, \overline{I}] \subseteq \mathbb{N}$ an interval. We are looking
for an assignment $\omega : \mathcal{D} \to \mathcal{S}$ with corresponding multipliers $m : \mathcal{D} \to \mathbb{N}$, such that the sum of distances

$$\sum_{d \in \mathcal{D}} \|\delta(d) - m(d) \cdot \omega(d)\|$$

is minimized under the conditions

$$|\omega(\mathcal{D})| \le p$$

and

$$\underline{I} \le \sum_{d \in \mathcal{D}} m(d) \cdot |\omega(d)| \le \overline{I}.$$

Let us now relate this new optimization problem to known combinatorial optimization problems.
Since we have to choose an optimal subset of $\mathcal{S}$ to minimize a cost function subject to some constraints,
the cardinality constrained p-median problem belongs to the large class of generic selection problems.

Clearly, it is closely related to the p-median problem. The only characteristics of our problem that are
not covered by the p-median problem are the multipliers $m$ and the cardinality condition. If we relax the
cardinality condition we can easily transform our problem into a classical p-median problem: for every
element $d \in \mathcal{D}$ and every element $s \in \mathcal{S}$ there exists an optimal multiplier $m_{d,s}$ such that $\|\delta(d) - m_{d,s} \cdot s\|$
is minimal, and this minimal value can serve as the distance between $d$ and $s$.
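
The reduction to a classical p-median instance can be sketched in Python as follows: for every pair of demander and item the best multiplier is fixed in advance, and the resulting minimal distance becomes the p-median cost. The function names, the bound `max_mult` on the multipliers, the use of the $\|\cdot\|_1$-norm and the sample data are assumptions of this sketch.

```python
def l1_distance(demand, item, m):
    """Distance between a demand vector and m copies of an item vector."""
    return sum(abs(d - m * q) for d, q in zip(demand, item))


def p_median_costs(demands, items, max_mult):
    """Map (demander, item) to (best multiplier, minimal distance)."""
    costs = {}
    for name, demand in demands.items():
        for item in items:
            best_m = min(range(1, max_mult + 1),
                         key=lambda m: l1_distance(demand, item, m))
            costs[(name, item)] = (best_m, l1_distance(demand, item, best_m))
    return costs


# Hypothetical data: two demanders, two chooseable items.
demands = {"d1": (1.0, 2.0, 1.0), "d2": (2.2, 3.8, 2.1)}
items = [(1, 2, 1), (1, 1, 1)]
costs = p_median_costs(demands, items, max_mult=3)
```

Once this cost table is available, the multipliers no longer appear as decision variables, which is exactly why the cardinality condition has to be relaxed for the reduction to work.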

If we do not bound $|\omega(\mathcal{D})|$ from above but assign costs for using elements of $\mathcal{S}$ instead, which means
using another lot-type in our practical application, we end up with the facility location problem. Clearly we
also have some kind of assignment problem, since we have to determine an assignment $\omega$ between the
set $\mathcal{D}$ and a subset of $\mathcal{S}$.

One can also look at our problem from a completely different angle. Actually we are given a set of $|\mathcal{B}|$
real-valued demand vectors, which we want to approximate by a finite number of integer-valued vectors
using integral multiples. There is a well established theory in number theory on so-called Diophantine
approximation f?, Ti] or simultaneous approximation, which is somewhat related to our approximation
problem. Here one is interested in simultaneously minimizing

$$\left| \alpha_i - \frac{p_i}{q} \right|$$

for linearly independent real numbers $\alpha_i$ by integers $p_i$ and $q$ [27, 22]. One might use some results from
this theory to derive some bounds for our problem. One might also have a look at [9].

For a more exhaustive and detailed analysis of the taxonomy of the broad field of facility-location
problems and their modeling we refer to [26].

4.1. Approximation algorithms and heuristics for related problems. Facility location problems and the
p-median problem are well known and much research has been done. Since, moreover, these problems 
are closely related to our optimization problem, we would like to mention some literature and methods on 
approximation algorithms and heuristics for these problems. 

Lin and Vitter [23] have developed a filtering and rounding technique which rounds fractional solutions of
the standard LP for these problems to obtain good integer solutions. For the metric case some bounds
on the approximation quality are given. Based on this work some improvements were made in [28], where
the authors give a polynomial-time 3.16-approximation algorithm for the metric facility location problem,
and Els], where the authors give a polynomial-time $6\frac{2}{3}$-approximation algorithm for the metric p-median
problem and a 9.8-approximation algorithm for the p-facility location problem.

Besides rounding techniques for LP solutions, greedy techniques have also been applied to the facility
location problem and the p-median problem. Some results are given in lfT2l [Tsl [161 . Since these problems
are so prominent in applications, the whole breadth of heuristics has been applied to them. Examples are scatter
search 1. 10. . 8], local search IJIHE], and neighborhood search inTl [T4l .

Good overviews of the broad topic of approximation algorithms and heuristics for the facility location
and the p-median problem are given in [28, 29, 7, 25].

Besides results for the metric case there are also results for the non-metric case, see, e.g., [30].
Unfortunately, none of the theoretical guarantees seems to survive the introduction of the cardinality
constraint in general.

5. A PRACTICAL HEURISTIC FOR THE CARDINALITY CONSTRAINED p-MEDIAN PROBLEM 

As already mentioned in Section 3, solving our ILP formulation of our problem is too slow in practical
applications. So there is a real need for a fast heuristic which yields good solutions, which is the topic of this
section.

In Section 4 we have analyzed our problem from different theoretical points of view. What happens if we
relax some conditions or fix some decisions? A very important decision is: which lot-types should be used
in the first place? Here one should have in mind that the cardinality $|\mathcal{L}_p|$ of the set of feasible lot-types is
very large compared to the number $\kappa$ of lot-types which can be used for the delivery process of a specific
product $p$.

5.1. Heuristic selection of lot-types. For this selection problem of lot-types we utilize a scoring method.
For every branch $b \in \mathcal{B}_p$ with demand $\eta_{b,p}$ there exists a lot-type $l \in \mathcal{L}_p$ and a multiplicity $m \in \mathbb{N}$ such
that $\|\eta_{b,p} - m \cdot l\|$ is minimal in the set $\{\|\eta_{b,p} - m' \cdot l'\| : l' \in \mathcal{L}_p, m' \in \mathbb{N}\}$. So for every branch $b \in \mathcal{B}_p$
there exists a lot-type that fits best. More generally, for a given $k \le |\mathcal{L}_p|$ there exist lot-types $l_1, \dots, l_k$ such
that $l_i$ fits $i$-best if one uses the corresponding optimal multiplicity. Let us examine this situation from the
point of view of the different lot-types. A given lot-type $l \in \mathcal{L}_p$ is the $i$-best fitting lot-type for a number
$\rho_{l,i}$ of branches in $\mathcal{B}_p$. Writing these numbers $\rho_{l,i}$ as a vector $\rho_l \in \mathbb{N}^k$ we obtain score vectors for all
lot-types $l \in \mathcal{L}_p$.
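
The construction of the score vectors can be sketched in Python as follows. The sketch assumes the $\|\cdot\|_1$-norm, multipliers between 1 and a hypothetical bound `max_mult`, and plain sorting instead of a heap; the final ordering with the largest score vector first is one possible reading of the ranking and relies on Python comparing lists lexicographically.

```python
def best_deviation(demand, lot, max_mult):
    """Smallest l1 deviation of a lot-type over all admissible multipliers."""
    return min(sum(abs(d - m * q) for d, q in zip(demand, lot))
               for m in range(1, max_mult + 1))


def score_vectors(demands, lot_types, max_mult, k):
    """scores[l][i] counts the branches for which l is the (i+1)-best fit."""
    scores = {lot: [0] * k for lot in lot_types}
    for demand in demands.values():
        ranked = sorted(lot_types,
                        key=lambda lot: best_deviation(demand, lot, max_mult))
        for i, lot in enumerate(ranked[:k]):
            scores[lot][i] += 1
    return scores


# Hypothetical instance with three branches and three lot-types.
demands = {"b1": (1.1, 2.0, 1.0), "b2": (2.5, 3.8, 1.6), "b3": (0.9, 1.1, 1.0)}
lot_types = [(1, 2, 1), (1, 1, 1), (2, 2, 1)]
scores = score_vectors(demands, lot_types, max_mult=2, k=2)
# Lexicographic comparison of the score vectors, largest first.
ranking = sorted(lot_types, key=lambda lot: scores[lot], reverse=True)
```

Here the lot-type (1, 2, 1) is the best fit for two branches and the second-best fit for the third one, so it ends up first in the ranking.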

Now we want to use these score vectors $\rho_l$ to sort the lot-types of $\mathcal{L}_p$ in decreasing approximation
quality. Using the lexicographic ordering $\preceq$ on vectors we can determine a bijective rank function
$\Lambda : \mathcal{L}_p \to \{1, \dots, |\mathcal{L}_p|\}$. (We simply sort the score vectors according to $\preceq$ and in the case of equality we choose an
arbitrary order.) We extend $\Lambda$ to subsets $\mathcal{L}' \subseteq \mathcal{L}_p$ by $\Lambda(\mathcal{L}') = \sum_{l \in \mathcal{L}'} \Lambda(l) \in \mathbb{N}$.

To fix the lot-types we simply loop over subsets $\mathcal{L}' \subseteq \mathcal{L}_p$ of cardinality $\kappa$ in decreasing order with
respect to $\Lambda(\mathcal{L}')$. In principle we consider all possible selections $\mathcal{L}'$ of $\kappa$ lot-types, but in practice we stop
our computations after an adequate time period, with the great advantage that we have checked the, in some
heuristic sense, most promising selections $\mathcal{L}'$ first.

Now we have to go into detail how to efficiently determine the $k$ best fitting lot-types with corresponding
optimal multiplicities for each branch $b \in \mathcal{B}_p$. We simply loop over all branches $b \in \mathcal{B}_p$ and determine the
set of the $k$ best fitting lot-types separately. Here we also simply loop over all lot-types $l \in \mathcal{L}_p$ and determine
the corresponding optimal multiplier $m$ by binary search, which is possible due to the convexity of norm functions
(it is actually very easy to effectively determine lower and upper bounds for $m$ from $\eta_{b,p}$ and $l$). Using a heap data
structure the sorting of the $k$ best fitting lot-types can be done in $O(|\mathcal{L}_p|)$ time if $k \log k \in O(|\mathcal{L}_p|)$, which is
not a real restriction for practical problems. We further want to remark that we do not have to sort the score
vectors completely since in practice we will not loop over all $\binom{|\mathcal{L}_p|}{\kappa}$ possible selections of lot-types. If one
does not want to use a priori bounds (meaning that one excludes the lot-types with high rank $\lambda$) one could
use a lazy or delayed computation of the sorting of $\lambda$ by utilizing again a heap data structure.
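The per-branch computation described above can be sketched as follows. The sketch assumes the 1-norm as deviation measure, a smallest admissible multiplier `m_min = 1`, and a simple upper bound for the multiplier; these choices and all function names are assumptions made for illustration only.

```python
import heapq
import math


def l1_deviation(demand, lot, m):
    """1-norm deviation between a branch demand and m copies of a lot-type."""
    return sum(abs(d - m * x) for d, x in zip(demand, lot))


def best_multiplier(demand, lot, m_min=1):
    """Return (m, deviation) for the smallest minimising multiplier m >= m_min.

    m -> l1_deviation(demand, lot, m) is convex, so a binary search on the
    sign of the forward difference finds a minimiser.
    """
    # Assumed upper bound: from here on no size is under-supplied any more,
    # so the deviation cannot decrease for larger multipliers.
    m_max = max([m_min] + [math.ceil(d / x) for d, x in zip(demand, lot) if x > 0])
    lo, hi = m_min, m_max
    while lo < hi:
        mid = (lo + hi) // 2
        if l1_deviation(demand, lot, mid) <= l1_deviation(demand, lot, mid + 1):
            hi = mid          # the deviation does not decrease after mid
        else:
            lo = mid + 1      # the deviation still decreases after mid
    return lo, l1_deviation(demand, lot, lo)


def k_best_lot_types(demand, lot_types, k):
    """Return the k best fits as (deviation, lot_index, multiplier) tuples."""
    candidates = []
    for index, lot in enumerate(lot_types):
        m, deviation = best_multiplier(demand, lot)
        candidates.append((deviation, index, m))
    # heapq.nsmallest uses a heap internally and avoids a full sort.
    return heapq.nsmallest(k, candidates)
```

For a branch with demand `(2, 4, 4, 2, 1)` and the lot-type `(1, 2, 2, 1, 1)` the search returns the multiplier 2 with a deviation of 1.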

5.2. Adjusting a delivery plan to the cardinality condition. If we determine assignments $\omega_p$ with
corresponding multipliers $m_p$ with the heuristic described in Subsection 5.1, in many cases we will not
satisfy the cardinality condition, since it is not taken into account at all by our heuristic. Our strategy to satisfy
the cardinality condition is to adjust $m_p$ afterwards by decreasing or increasing the calculated multipliers,
unless the condition is fulfilled by pure chance.
Here we want to use a greedy algorithm and have to distinguish two cases. If $I(\omega_p, m_p)$ is smaller than
$\underline{I}$, then we increase some of the values of $m_p$; otherwise we have $I(\omega_p, m_p) > \overline{I}$ and we decrease some of
the values of $m_p$. Our procedure works iteratively and we assume that the current multipliers are given by
$\tilde{m}_p$. Our stopping criterion is given by $\underline{I} \le I(\omega_p, \tilde{m}_p) \le \overline{I}$ or that there are no feasible operations left. We
restrict our explanation of a step of the iteration to the case where we want to decrease the values of $\tilde{m}_p$.
For every branch $b \in \mathcal{B}_p$ the reduction of $\tilde{m}_p(b)$ by one produces costs

$$\Delta_b = \sigma_{b,\omega_p(b),\tilde{m}_p(b)-1} - \sigma_{b,\omega_p(b),\tilde{m}_p(b)}$$

if the reduction of $\tilde{m}_p(b)$ by one is allowed (a suitable condition is $\tilde{m}_p(b) \ge 1$ or $\tilde{m}_p(b) \ge 2$) and $\Delta_b = \infty$ if we
do not have the possibility to reduce the multiplier $\tilde{m}_p(b)$ by one. A suitable data structure for the $\Delta_b$ values
is a heap, for which the update after an iteration can be done in $O(1)$ time. If we reach $I(\omega_p, \tilde{m}_p) < \underline{I}$ at
some point, we simply discard this particular selection $\omega_p$ and consider the next selection candidate.
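A minimal sketch of the decreasing case of this greedy adjustment is given below. It rests on several assumptions that are not fixed by the text: the cardinality is taken to be the total number of delivered items (multiplier times the number of items in the assigned lot, summed over all branches), the deviation is measured in the 1-norm, and a reduction is allowed as long as the multiplier does not drop below `m_min`. All names are hypothetical.

```python
import heapq


def l1_deviation(demand, lot, m):
    """1-norm deviation between a branch demand and m copies of a lot-type."""
    return sum(abs(d - m * x) for d, x in zip(demand, lot))


def decrease_multipliers(demands, lots, multipliers, lower, upper, m_min=1):
    """Greedily reduce multipliers until the cardinality is at most `upper`.

    demands[b], lots[b], multipliers[b] describe branch b (lots[b] is the
    lot-type assigned to b). Returns the adjusted multipliers, or None if
    the interval [lower, upper] cannot be reached (selection is discarded).
    """
    m = list(multipliers)
    # Assumed cardinality: total number of delivered items.
    total = sum(mb * sum(lot) for mb, lot in zip(m, lots))

    def delta(b):
        # Cost of reducing the multiplier of branch b by one.
        return (l1_deviation(demands[b], lots[b], m[b] - 1)
                - l1_deviation(demands[b], lots[b], m[b]))

    # Branches whose multiplier may not be reduced are left out of the heap,
    # which has the same effect as an infinite cost.
    heap = [(delta(b), b) for b in range(len(m)) if m[b] > m_min]
    heapq.heapify(heap)

    while total > upper and heap:
        _, b = heapq.heappop(heap)
        m[b] -= 1
        total -= sum(lots[b])
        if m[b] > m_min:
            heapq.heappush(heap, (delta(b), b))

    return m if lower <= total <= upper else None
```

Each branch has at most one entry in the heap at any time, so no stale entries have to be filtered out. The increasing case would be analogous with the roles of the two bounds exchanged.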

Since this adjustment step can be performed very fast, one might also take some kind of general swap
techniques into account. Since there is an overwhelming number of papers on these techniques in the
literature, we will not go into detail here, but we would like to remark that in those cases (see Subsection 5.3)
where the optimality gap of our heuristic lies above 1 %, swapping can improve the solutions of our heuristic
considerably.

5.3. Optimality gap. To substantiate the usefulness of our heuristic we have compared the quality of the
solutions given by this heuristic after one second of computation time (on a standard laptop) with
the solution given by CPLEX 11.

Our business partner has provided us with historic sales information for nine different commodity groups, each
ranging over a sales period of at least one and a half years. For each commodity group we have performed a
test calculation for $\kappa \in \{1, 2, 3, 4, 5\}$, distributing some amount of items to almost all branches.

Commodity group 1: 

Cardinality interval: [10630, 11749]
number of sizes: $|\mathcal{S}_p| = 5$
number of branches: $|\mathcal{B}_p| = 1119$

               $\kappa=1$   $\kappa=2$   $\kappa=3$   $\kappa=4$   $\kappa=5$
CPLEX             4033.34      3304.10      3039.28      2951.62      2891.96
heuristic         4033.85      3373.95      3076.55      3011.49      2949.31
gap                0.013%       2.114%       1.226%       2.028%       1.983%

Table 1. Optimality gap in the $\|\cdot\|_1$-norm for our heuristic on commodity group 1
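The gap values can be reproduced with a short computation. The sketch below assumes that the gap is the relative excess of the heuristic objective value over the CPLEX value, expressed in percent; the text does not state this definition explicitly, but it matches the figures reported for commodity group 1. The function name is hypothetical.

```python
def optimality_gap_percent(heuristic_value, reference_value):
    """Relative excess of the heuristic value over the reference value, in percent."""
    return 100.0 * (heuristic_value - reference_value) / reference_value


# Objective values for commodity group 1, copied from Table 1.
cplex = [4033.34, 3304.10, 3039.28, 2951.62, 2891.96]
heuristic = [4033.85, 3373.95, 3076.55, 3011.49, 2949.31]

gaps = [round(optimality_gap_percent(h, c), 3) for h, c in zip(heuristic, cplex)]
print(gaps)
```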



Commodity group 2:

Cardinality interval: [10000, 12000]
number of sizes: $|\mathcal{S}_p| = 5$
number of branches: $|\mathcal{B}_p| = 1091$

               $\kappa=1$   $\kappa=2$   $\kappa=3$   $\kappa=4$   $\kappa=5$
CPLEX             2985.48      2670.04      2482.23      2362.75      2259.57
heuristic         3371.64      2671.72      2483.52      2362.90      2276.32
gap               12.934%       0.063%       0.052%       0.006%       0.741%

Table 2. Optimality gap in the $\|\cdot\|_1$-norm for our heuristic on commodity group 2



Commodity group 3:

Cardinality interval: [9785, 10815]
number of sizes: $|\mathcal{S}_p| = 5$
number of branches: $|\mathcal{B}_p| = 1030$

               $\kappa=1$   $\kappa=2$   $\kappa=3$   $\kappa=4$   $\kappa=5$
CPLEX           3570.3282    3022.2655    2622.8209    2488.1009      2413.55
heuristic         3571.61      3023.91      2625.29      2492.07      2417.65
gap                0.036%       0.054%       0.094%       0.160%       0.170%

Table 3. Optimality gap in the $\|\cdot\|_1$-norm for our heuristic on commodity group 3



Commodity group 4:

Cardinality interval: [10573, 11686]
number of sizes: $|\mathcal{S}_p|$
number of branches: $|\mathcal{B}_p| = 1119$

               $\kappa=1$   $\kappa=2$   $\kappa=3$   $\kappa=4$   $\kappa=5$
CPLEX             4776.36      4364.63      4169.94      4023.60      3890.87
heuristic         5478.19      4365.47      4170.23      4024.55      3892.35
gap               14.694%       0.019%       0.007%       0.024%       0.038%

Table 4. Optimality gap in the $\|\cdot\|_1$-norm for our heuristic on commodity group 4



Commodity group 5:

Cardinality interval: [16744, 18506]
number of sizes: $|\mathcal{S}_p| = 5$
number of branches: $|\mathcal{B}_p| = 1175$

               $\kappa=1$   $\kappa=2$   $\kappa=3$   $\kappa=4$   $\kappa=5$
CPLEX             4178.71      3418.37      3067.74      2874.70      2786.69
heuristic         4179.23      3418.87      3068.25      2875.21      2787.21
gap                0.013%       0.015%       0.017%       0.018%       0.019%

Table 5. Optimality gap in the $\|\cdot\|_1$-norm for our heuristic on commodity group 5



Commodity group 6:

Cardinality interval: [11000, 13000]
number of sizes: $|\mathcal{S}_p| = 4$
number of branches: $|\mathcal{B}_p| = 1030$

               $\kappa=1$   $\kappa=2$   $\kappa=3$   $\kappa=4$   $\kappa=5$
CPLEX             2812.22      2311.45      2100.78      1987.46      1909.21
heuristic         2812.63      2311.87      2101.25      1987.93      1909.63
gap                0.015%       0.018%       0.022%       0.024%       0.022%

Table 6. Optimality gap in the $\|\cdot\|_1$-norm for our heuristic on commodity group 6



Commodity group 7:

Cardinality interval: [15646, 17293]
number of sizes: $|\mathcal{S}_p| = 5$
number of branches: $|\mathcal{B}_p| = 1098$

               $\kappa=1$   $\kappa=2$   $\kappa=3$   $\kappa=4$   $\kappa=5$
CPLEX             4501.84      3917.96      3755.20      3660.32      3575.55
heuristic         4719.06      3918.46      3755.70      3660.84      3576.04
gap                4.825%       0.013%       0.013%       0.014%       0.014%

Table 7. Optimality gap in the $\|\cdot\|_1$-norm for our heuristic on commodity group 7



Commodity group 8: 

Cardinality interval: [11274, 12461]
number of sizes: $|\mathcal{S}_p| = 5$
number of branches: $|\mathcal{B}_p| = 989$

               $\kappa=1$   $\kappa=2$   $\kappa=3$   $\kappa=4$   $\kappa=5$
CPLEX             3191.66      2771.89      2575.37      2424.31      2331.67
heuristic         3579.35      2772.33      2575.81      2424.75      2332.11
gap               12.147%       0.016%       0.017%       0.018%       0.019%

Table 8. Optimality gap in the $\|\cdot\|_1$-norm for our heuristic on commodity group 8



Commodity group 9: 

Cardinality interval: [9211, 10181]
number of sizes: $|\mathcal{S}_p| = 5$
number of branches: $|\mathcal{B}_p| = 808$

               $\kappa=1$   $\kappa=2$   $\kappa=3$   $\kappa=4$   $\kappa=5$
CPLEX             3616.71      3215.17      2981.02      2837.66      2732.29
heuristic         3617.09      3215.53      3009.01      2860.85      2758.39
gap                0.010%       0.011%       0.939%       0.817%       0.955%

Table 9. Optimality gap in the $\|\cdot\|_1$-norm for our heuristic on commodity group 9



Besides these nine test calculations we have done several calculations on our data sets with different
parameters; e.g., we have considered cases with fewer sizes, fewer branches, smaller or larger cardinality
intervals, larger $\kappa$, or other magnitudes for the cardinality interval. From a qualitative point of view the
results are more or less the same as for the presented test calculations.




6. Conclusion and Outlook 

Starting from a real world optimization problem we have formalized a new general optimization problem,
which we call the cardinality p-facility location problem. It turns out that this problem is related to several other
well-known standard optimization problems. In Subsection 3.5 we have given an integer linear programming
formulation which has a very strong LP-relaxation. Although this approach is quite fast (computing times
below one hour), there was a practical need for fast heuristics to solve the problem. We have presented one
such heuristic which performs very well on real world data sets with respect to the optimality gap.

Some more theoretical work on the cardinality p-facility location problem and its relationships to other
classical optimization methods may lead to even stronger integer linear programming formulations or faster
branch-and-bound frameworks enhanced with some graph theoretic algorithms.

We also leave open the question of a good approximation algorithm for the cardinality p-facility location
problem. Having the known approximation algorithms for the other strongly related classical optimization
problems in mind, we are almost sure that it should not be too difficult to develop good approximation
algorithms for our problem.

For the practical problem the uncertainties and difficulties concerning the demand estimation have to be
faced. There are several ways to make solutions of optimization problems more robust. One possibility is
to utilize robust optimization methods. Another possibility is to consider the branch- and size-dependent
demands as stochastic variables and to utilize integer linear stochastic programming techniques; see, e.g.,
[3] or, more specifically, [29]. These enhanced models, however, will challenge the solution methods a lot,
since the resulting problems are of a much larger scale than the one presented in this paper. Nevertheless,
this is exactly what we are looking at next.